Pipeline Processor
Q21.
Data forwarding techniques can be used to speed up operation in the presence of data dependencies. Consider the following replacements of the LHS with the RHS:
i. $R1 \rightarrow Loc,\ Loc \rightarrow R2 \;\equiv\; R1 \rightarrow R2,\ R1 \rightarrow Loc$
ii. $R1 \rightarrow Loc,\ Loc \rightarrow R2 \;\equiv\; R1 \rightarrow R2$
iii. $R1 \rightarrow Loc,\ R2 \rightarrow Loc \;\equiv\; R1 \rightarrow Loc$
iv. $R1 \rightarrow Loc,\ R2 \rightarrow Loc \;\equiv\; R2 \rightarrow Loc$
In which of the following options will the result of executing the RHS be the same as executing the LHS, irrespective of the instructions that follow?
Q22.
Consider the following reservation table for a pipeline having three stages S1, S2 and S3. The minimum average latency (MAL) is ________.
Q23.
Consider a pipelined processor with the following four stages:
IF: Instruction Fetch
ID: Instruction Decode and Operand Fetch
EX: Execute
WB: Write Back
The IF, ID and WB stages take one clock cycle each to complete the operation. The number of clock cycles for the EX stage depends on the instruction. The ADD and SUB instructions need 1 clock cycle and the MUL instruction needs 3 clock cycles in the EX stage. Operand forwarding is used in the pipelined processor. What is the number of clock cycles taken to complete the following sequence of instructions?
$$\begin{array}{lllll}
\textbf{ADD} & \text{R2, R1, R0} &&& \text{R2 $\leftarrow$ R1+R0} \\
\textbf{MUL} & \text{R4, R3, R2} &&& \text{R4 $\leftarrow$ R3*R2} \\
\textbf{SUB} & \text{R6, R5, R4} &&& \text{R6 $\leftarrow$ R5-R4} \\
\end{array}$$
Q24.
We have two designs, D1 and D2, for a synchronous pipeline processor. D1 has 5 pipeline stages with execution times of 3 nsec, 2 nsec, 4 nsec, 2 nsec and 3 nsec, while D2 has 8 pipeline stages, each with an execution time of 2 nsec. How much time can be saved using design D2 over design D1 for executing 100 instructions?
Q25.
Delayed branching can help in the handling of control hazards. The following code is to run on a pipelined processor with one branch delay slot:
I1: ADD R2 $\leftarrow$ R7 + R8
I2: SUB R4 $\leftarrow$ R5 - R6
I3: ADD R1 $\leftarrow$ R2 + R3
I4: STORE Memory[R4] $\leftarrow$ R1
BRANCH to Label if R1 == 0
Which of the instructions I1, I2, I3 or I4 can legitimately occupy the delay slot without any other program modification?
Q26.
Delayed branching can help in the handling of control hazards. For all delayed conditional branch instructions, irrespective of whether the condition evaluates to true or false
Q27.
Consider an instruction pipeline with four stages (S1, S2, S3 and S4), each with combinational circuit only. Pipeline registers are required between each stage and at the end of the last stage. Delays for the stages and for the pipeline registers are as given in the figure. What is the approximate speedup of the pipeline in steady state under ideal conditions when compared to the corresponding non-pipelined implementation?
Q28.
A pipeline P operating at 400 MHz has a speedup factor of 6 and operates at 70% efficiency. How many stages are there in the pipeline?
Q29.
A non-pipelined single-cycle processor operating at 100 MHz is converted into a synchronous pipelined processor with five stages requiring 2.5 nsec, 1.5 nsec, 2 nsec, 1.5 nsec and 2.5 nsec, respectively. The delay of the latches is 0.5 nsec. The speedup of the pipelined processor for a large number of instructions is:
Q30.
The floating point unit of a processor using a design $D$ takes $2t$ cycles compared to $t$ cycles taken by the fixed point unit. There are two more design suggestions, $D_1$ and $D_2$. $D_1$ uses 30% more cycles for the fixed point unit but 30% fewer cycles for the floating point unit as compared to design $D$. $D_2$ uses 40% fewer cycles for the fixed point unit but 10% more cycles for the floating point unit as compared to design $D$. For a given program which has 80% fixed point operations and 20% floating point operations, which of the following orderings reflects the relative performance of the three designs? ($D_i > D_j$ denotes that $D_i$ is faster than $D_j$)
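
One way to check the comparison in the last question is to compute the weighted average cycles per operation for each design. The sketch below is illustrative only: it assumes all three designs run at the same clock rate, that the 80%/20% split refers to operation counts, and it sets t = 1 for convenience. The helper name `avg_cycles` is hypothetical and does not come from the question.

```python
# Minimal sketch: weighted average cycles per operation for each design.
# Assumptions (not stated in the question): same clock rate for all designs,
# the 80/20 mix counts operations, and t is normalised to 1 cycle.

def avg_cycles(fixed_cycles, float_cycles, fixed_frac=0.8, float_frac=0.2):
    """Average cycles per operation for a given fixed/floating point mix."""
    return fixed_frac * fixed_cycles + float_frac * float_cycles

t = 1.0  # fixed point cost in design D; floating point costs 2t

designs = {
    "D": avg_cycles(t, 2 * t),                # baseline
    "D1": avg_cycles(1.3 * t, 0.7 * 2 * t),   # 30% more fixed, 30% fewer float
    "D2": avg_cycles(0.6 * t, 1.1 * 2 * t),   # 40% fewer fixed, 10% more float
}

# Fewer average cycles means a faster design, so sort ascending.
ranking = sorted(designs, key=designs.get)

for name in ranking:
    print(f"{name}: {designs[name]:.2f} t")
print(" > ".join(ranking))
```

Under these assumptions the averages come out to 0.92t for D2, 1.20t for D and 1.32t for D1, so the fixed point savings in D2 outweigh its slower floating point unit for this instruction mix.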